MIRAI: Music Information Retrieval Based on Automatic Indexing

Authors

  • Rory A. Lewis
  • Mirsad Hadzikadic
Abstract

The growing volume and popularity of multimedia resources available on the Web have created the need for new, more advanced research tools. However, searching through multimedia data is a highly non-trivial task that requires content-based indexing of the data. My research will focus on automatic extraction of information about sound timbre, and on indexing sound data with information about the musical instrument playing in a given segment. Sound timbre is a very important factor that can affect the perceptual grouping of music; the practical role of timbre-based grouping of music is discussed thoroughly in (Bregman, 1990). The aim is to perform automatic classification of musical instrument sounds from real recordings for a broad range of sounds, independently of the fundamental frequency of the sound. My thesis will focus on musical instruments of definite pitch used in contemporary orchestras and bands, and the full range of the musical scale will be investigated for each instrument. The investigation will start with the descriptors defined in MPEG-7. Although MPEG-7 provides some tools for indexing with musical instrument names, this information is typically inserted manually (for instance, tracks are labeled with voices/instruments in recording studios); there are no algorithms included in MPEG-7 to automate this task. In order to index the enormous number of audio files of various origins available to users on the Web, special processing and new algorithms are needed to extract this kind of knowledge directly from audio signals. The Music Information Retrieval Based on Automatic Indexing system will be called MIRAI and will be based on low-level descriptors that can easily be extracted automatically from any audio signal. Apart from observing the descriptor set for a given frame, it will also trace descriptor changes over time.
Finally, if MPEG-7 becomes a commonly used standard, the results of this research will support its interoperability across various applications in the music domain. Automatic sound indexing should allow labeling sound segments with instrument names. The MIRAI implementation will start with singular, homophonic sounds of musical instruments, and then extend to simultaneous, polyphonic sounds. Knowledge discovery techniques will be applied at this stage of the research. First of all, we have to discover rules that recognize various musical instruments. Next, we apply these rules, one by one, to unknown sounds. By identifying the so-called supporting rules, we should be able to point out which instrument is playing (or dominating) in a given segment, and at what time instants this instrument starts and stops playing. Additionally, MIRAI will extract pitch information, one of the important factors in sound classification. By combining melody and timbre information, MIRAI should be able to search successfully for favorite tunes played by favorite instruments. Significance of the thesis: The MIRAI thesis will advance research on automatic content extraction from audio data, with application to full-band musical sounds, as opposed to the fairly broad research on speech signals, which is usually limited in frequency range. Investigating automatic indexing of instrumental recordings will also allow formalizing the description of sound timbre for musical instruments. There are a number of different approaches to sound timbre (for instance (Balzano, 1986) or (Cadoz, 1985)); a dimensional approach to timbre description was proposed by (Bregman, 1990). Timbre description is basically subjective and vague, and only some subjective features have well-defined objective counterparts, such as brightness, calculated as the gravity center of the spectrum.
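As a concrete illustration, the brightness descriptor mentioned above, the gravity center of the magnitude spectrum (commonly called the spectral centroid), can be computed in a few lines of NumPy. This is a minimal sketch, not the MPEG-7 reference implementation; the sampling rate and the absence of windowing are simplifying assumptions.

```python
import numpy as np

def spectral_centroid(frame, sr):
    """Brightness of one audio frame: the gravity center of its
    magnitude spectrum, in Hz (a low-level MPEG-7-style descriptor)."""
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    total = mag.sum()
    if total == 0.0:
        return 0.0  # silent frame: centroid undefined, report 0
    return float((freqs * mag).sum() / total)

# Sanity check: a pure 440 Hz sine concentrates its spectral energy
# at 440 Hz, so its centroid should land very close to 440.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
print(spectral_centroid(tone, sr))  # ≈ 440
```

Real systems apply a window function (e.g. Hann) before the FFT to reduce spectral leakage; it is omitted here for brevity.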
Explicit formulation of rules for the objective specification of timbre in terms of digital descriptors will formally express subjective and informal sound characteristics. This is especially important in the light of human perception of sound timbre. Time-variant information is necessary for correct classification of musical instrument sounds, because the quasi-steady state alone is not sufficient even for human experts. Therefore, the evolution of sound features in time should be reflected in the sound description as well. The discovered temporal patterns may express sound features better than static features, especially since classic features can be very similar for sounds representing the same family or pitch, whereas the variability of features with pitch for the same instrument makes the sounds of one instrument dissimilar. Therefore, classical sound features can make correct identification of a musical instrument independently of pitch very difficult and error-prone. This research represents the first application of discovering temporal patterns in the time evolution of MPEG-7 based, low-level sound descriptors of musical instrument sounds, with application to simultaneous sounds. KDD methods applied to the extraction of temporal patterns and the search for the best classifier (quite successful in other domains, e.g., business and medicine) will complement the signal analysis methods used as a preprocessing tool and will contribute to the development of knowledge on musical timbre. I will also perform research on the construction of new attributes in order to find the best representation for sound recognition purposes. In recent years, there has been a tremendous need for the ability to query and process vast quantities of musical data, which are not easy to describe with mere symbols. Automatic content extraction is clearly needed here, and it relates to the ability to identify the segments of audio in which particular instruments are playing.
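The idea of tracing descriptor evolution over time can be sketched as a frame-by-frame trajectory of a single low-level descriptor. The sketch below is illustrative only: the frame and hop sizes are arbitrary assumptions, and the spectral centroid stands in for the full descriptor set the thesis would use.

```python
import numpy as np

def centroid_trajectory(signal, sr, frame_len=1024, hop=512):
    """Evolution of the spectral centroid over time: one value per
    analysis frame, i.e. a temporal pattern rather than a static feature."""
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    traj = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        mag = np.abs(np.fft.rfft(signal[start:start + frame_len]))
        traj.append(float((freqs * mag).sum() / (mag.sum() + 1e-12)))
    return np.array(traj)

# A rising chirp should produce a rising centroid trajectory:
# the temporal pattern, not any single frame, carries the information.
sr = 8000
t = np.arange(sr) / sr
chirp = np.sin(2 * np.pi * (200.0 + 600.0 * t) * t)  # sweeps upward in pitch
traj = centroid_trajectory(chirp, sr)
print(traj[0] < traj[-1])  # True: brightness increases over time
```

A classifier mining temporal patterns would operate on such trajectories (or on deltas between frames) rather than on one averaged value.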
It also relates to the ability to identify musical pieces representing different types of emotions, which music clearly evokes, or to generate human-like expressive performances (Mantaras and Arcos, 2002). Automatic content extraction may relate to many different types of semantic information associated with musical pieces. Some information can be stored as metadata provided by experts, but some has to be computed automatically. I believe that my approach based on KDD techniques will advance research on automatic content extraction, not only in identifying the segments of audio in which particular instruments are playing, but also in identifying segments of audio containing other, more complex semantic information. Background for MIRAI: In recent years, automatic indexing of multimedia data has become an area of considerable research interest because of the need for quick searching of digital multimedia files. Broad access to the Internet, available to millions of users, creates a significant market for products dealing with content-based searching through multimedia files. The domain of image processing and content extraction is extensively explored all over the world, and numerous publications are available on that topic. Automatic extraction of audio content is not as thoroughly explored, especially for musical sounds. Methods in the research on musical instrument sound classification: Broader research on automatic musical instrument sound classification dates back only a few years. So far, there is no standard parameterization used as a classification basis. The sound descriptors used are based on various methods of analysis in the time and spectrum domains, with the Fourier transform being the most common for spectral analysis. Wavelet analysis has also gained increasing interest for sound, and especially musical sound, analysis and representation; see for instance (Popovic, Coifman and Berger, 1995), (Goodwin, 1997).
The diversity of sound timbres is also used to facilitate data visualization via sonification, in order to make complex data easier to perceive (Ben-Tal, Berger, B. Cook, Daniels, Scavone and P. Cook, 2002). Many parameterization and recognition methods, including pitch extraction techniques, applied in musical research come from the speech and speaker recognition domain (Flanagan, 1972), (Rabiner and Schafer, 1978). Sound parameters applied in musical instrument classification research performed so far include cepstral coefficients, constant-Q coefficients, spectral centroid, autocorrelation coefficients, and moments of the time wave (Brown, Houix and McAdams, 2001), wavelet analysis (Wieczorkowska, 2001), (Kostek and Czyzewski, 2001), root mean square (RMS) amplitude envelope and multidimensional scaling analysis trajectories (Kaminskyj, 2000), and various...
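Among the descriptors listed above, the RMS amplitude envelope is the simplest to illustrate. The sketch below assumes arbitrary frame and hop sizes and is not tied to any of the cited implementations.

```python
import numpy as np

def rms_envelope(signal, frame_len=1024, hop=512):
    """Root-mean-square amplitude envelope: one RMS value per frame,
    a classic time-domain descriptor of how loudness evolves."""
    return np.array([
        float(np.sqrt(np.mean(signal[i:i + frame_len] ** 2)))
        for i in range(0, len(signal) - frame_len + 1, hop)
    ])

# A steady unit-amplitude sine has RMS 1/sqrt(2) ~= 0.707 in every frame,
# so its envelope is flat; attack and decay of a real note would show
# as a rise and fall in this curve.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
print(rms_envelope(tone)[:3])
```

Such an envelope is often paired with spectral descriptors, since it captures the attack/decay shape that spectral features alone miss.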


Similar resources

MIRAI: Multi-hierarchical Music Automatic Indexing and Retrieval System

Recently, numerous successful approaches have been developed for instrument recognition in monophonic sounds. Unfortunately, none of them can be successfully applied to polyphonic sounds. Identification of music instruments in polyphonic sounds is still difficult and challenging. This has stimulated a number of research projects on music sound separation and new features development for content...


MIRAI: Multi-hierarchical, FS-Tree Based Music Information Retrieval System

With the fast booming of online music repositories, there is a need for content-based automatic indexing which will help users to find their favorite music objects in real time. Recently, numerous successful approaches on musical data feature extraction and selection have been proposed for instrument recognition in monophonic sounds. Unfortunately, none of these methods can be successfully appl...


Music Information Retrieval with Polyphonic Sounds and Timbre

With the fast booming of online music repositories, the problem of building music recommendation systems is of great importance. There is an increasing need for content-based automatic indexing to help users find their favorite music objects. In this work, we propose a new method for automatic classification of musical instruments. We use a unique set of timbre related descriptors, extracted on...


Comparison Of Modified Dual Ternary Indexing And Multi-Key Hashing Algorithms For Music Information Retrieval

In this work we have compared two indexing algorithms that have been used to index and retrieve Carnatic music songs. We have compared a modified version of the dual ternary indexing algorithm for music indexing and retrieval with the multi-key hashing indexing algorithm proposed by us. The modification in the dual ternary algorithm was essential to handle variable-length query phrases and to ...


Automatic labeling of tabla signals

Most of the recent developments in the field of music indexing and music information retrieval are focused on Western music. In this paper, we present an automatic music transcription system dedicated to the tabla, a North Indian percussion instrument. Our approach is based on three main steps: firstly, the audio signal is segmented into adjacent segments where each segment represents a single stroke....



Journal title:

Volume   Issue

Pages  -

Publication date: 2006